Sparse Supernodal Solver Using Block Low-Rank Compression: design, performance and analysis

Abstract

This paper presents two approaches using a Block Low-Rank (BLR) compression technique to reduce the memory footprint and/or the time-to-solution of the sparse supernodal solver PaStiX. This flat, non-hierarchical compression method takes advantage of the low-rank property of the blocks that appear during the factorization of sparse linear systems arising from the discretization of partial differential equations. The first approach, called Minimal Memory, illustrates the maximum memory gain that can be obtained with the BLR compression method, while the second approach, called Just-In-Time, mainly focuses on reducing the computational complexity and thus the time-to-solution. Singular Value Decomposition (SVD) and Rank-Revealing QR (RRQR) are compared as compression kernels in terms of factorization time, memory consumption, and numerical properties. Experiments on a single node with 24 threads and 128 GB of memory are performed to evaluate the potential of both strategies. On a set of matrices from real-life problems, we demonstrate a memory footprint reduction of up to 4 times using the Minimal Memory strategy and a computational time speedup of up to 3.5 times with the Just-In-Time strategy. Then, we study the impact of the configuration parameters of the BLR solver, which allowed us to solve a 3D Laplacian of 36 million unknowns on a single node, while the full-rank solver stopped at 8 million due to memory limitations.
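To make the idea of compressing a single block concrete, the following is a minimal sketch in pure Python, not the PaStiX kernels themselves. It uses a greedy column-pivoted Gram-Schmidt process as a simple stand-in for a rank-revealing QR factorization. The function names (`compress_block`, `reconstruct`), the tolerance value, and the synthetic 40 x 30 test block are all assumptions made for illustration. A dense m x n block stores m*n entries, whereas a rank-r representation Q*R stores r*(m+n), which is where the memory gain comes from when r is small.

```python
import math


def compress_block(block, tol):
    """Greedy column-pivoted Gram-Schmidt: block ~= Q * R.

    Returns Q (m x r, orthonormal columns) and R (r x n). Steps are added
    until the residual Frobenius norm is at most tol * ||block||_F.
    Illustrative only; not the kernel used by any particular solver.
    """
    m, n = len(block), len(block[0])
    # Residual stored column by column; starts as a copy of the block.
    resid = [[block[i][j] for i in range(m)] for j in range(n)]
    total = math.sqrt(sum(x * x for col in resid for x in col))
    q_cols, r_rows = [], []
    while len(q_cols) < min(m, n):
        col_sq = [sum(x * x for x in col) for col in resid]
        if math.sqrt(sum(col_sq)) <= tol * total:
            break  # residual is small enough, stop adding rank
        # Pivot on the residual column with the largest norm.
        pivot = max(range(n), key=col_sq.__getitem__)
        norm = math.sqrt(col_sq[pivot])
        q = [x / norm for x in resid[pivot]]
        # Projection coefficients of every residual column onto q.
        coeffs = [sum(qi * x for qi, x in zip(q, col)) for col in resid]
        # Remove the component along q from every residual column.
        resid = [[x - c * qi for x, qi in zip(col, q)]
                 for col, c in zip(resid, coeffs)]
        q_cols.append(q)
        r_rows.append(coeffs)
    Q = [[q_col[i] for q_col in q_cols] for i in range(m)]
    return Q, r_rows


def reconstruct(Q, R):
    """Multiply Q (m x r) by R (r x n) to recover the approximation."""
    n = len(R[0]) if R else 0
    return [[sum(qk * R[k][j] for k, qk in enumerate(row)) for j in range(n)]
            for row in Q]


# Synthetic block built as the sum of two outer products (hypothetical data).
m, n = 40, 30
block = [[float((i + 1) * (j + 1) + (i % 3) * (j % 5)) for j in range(n)]
         for i in range(m)]

Q, R = compress_block(block, tol=1e-8)
rank = len(R)
dense_entries = m * n               # storage of the full-rank block
low_rank_entries = rank * (m + n)   # storage of Q and R together
approx = reconstruct(Q, R)
max_err = max(abs(block[i][j] - approx[i][j])
              for i in range(m) for j in range(n))
print(rank, dense_entries, low_rank_entries, max_err)
```

The tolerance plays the role of the accuracy parameter of a compression kernel: loosening it lowers the rank and the storage at the cost of a larger approximation error. An SVD-based kernel would give the optimal rank for a given tolerance but is typically more expensive, which is the kind of trade-off the abstract says the paper compares.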
